SUMMARY


A robust version of Akaike's model selection procedure for regression models
is introduced and its relationship with robust testing procedures is discussed.


Some key words: Akaike Information Criterion; Cp criterion; M-estimators;
Robust tests; Regression models.


1.  INTRODUCTION 


The Akaike Information Criterion is a powerful tool for choosing among
different models that can be used to fit a given data set. If we denote by
$L_p$ the log-likelihood of the model with $p$ parameters, this amounts to
choosing the model that minimizes $-2L_p + 2p$. This procedure may be viewed
as an extension of the likelihood principle and is based on a general
information theoretic criterion. In fact $2L_p - 2p$ is a suitable estimate of
the expected entropy of the model and by the Akaike Criterion the entropy will
be, at least approximately, maximized; cf. Akaike (1973).

Bhansali and Downham (1977) proposed to generalize the Akaike Criterion
by choosing the model that minimizes, for a given fixed $\alpha$,

\[ \mathrm{AIC}(p;\alpha) = -2L_p + \alpha p . \tag{1} \]

Several proposals have been made for choosing $\alpha$; see, for instance,
Bhansali and Downham (1977), Atkinson (1980). If we apply (1) to a linear
regression model

\[ y_i = x_i^T \theta + e_i , \quad i = 1, \ldots, n \tag{2} \]

with $n$ independent identically normally distributed errors with variance
$\sigma^2$, we obtain

\[ \mathrm{AIC}(p;\alpha) = K(n,\hat\sigma) + R_p/\hat\sigma^2 + \alpha p , \tag{3} \]

where $K(n,\hat\sigma)$ is a constant depending on the marginal distribution
of the $x_i$'s, $\hat\sigma^2$ is some estimate of $\sigma^2$ and
$R_p = \sum_{i=1}^{n} (y_i - x_i^T \hat\theta_p)^2$ is the residual sum of
squares with respect to the least squares estimate $\hat\theta_p$. AIC(p;2) is
equivalent to Mallows' Cp statistic; see Mallows (1973).

One of the main goals of robust statistics is to find new statistical
procedures that are not influenced too much by small deviations from the
distributional assumptions of the model. In recent years a considerable
amount of work has been directed toward constructing robust estimators and
testing procedures for regression models, but the aspects related to a
robust model choice have been somewhat neglected. Since the AIC statistic
for regression models is a direct consequence of the normality assumption
on the errors' distribution (see (3)), we cannot use it in this form with
robust estimators and robust tests. The purpose of this note is to introduce
a robust selection procedure for regression that, first, allows us to
choose the model which fits the majority of the data, taking into account
that the errors might not be exactly normally distributed, and second,
can be used consistently with new robust estimators and tests.

In Section 2 the new robust procedure is introduced and its relationship
to robust testing procedures is discussed. Section 3 presents some
possible choices of the parameter $\alpha$ for the robust selection procedure.




2.  A  ROBUST  SELECTION  PROCEDURE 


Let us assume that the errors in (2) follow some distribution with
density $g$. Then the right hand side of (1) becomes

\[ K(n,\sigma) - 2 \sum_{i=1}^{n} \log g\bigl((y_i - x_i^T T_{n;p})/\sigma\bigr) + \alpha p , \tag{4} \]


where $T_{n;p}$ denotes the maximum likelihood estimator of $\theta$ when the
errors' distribution is $g$. If we replace $-\log g$ in (4) by a general
function $\rho$ we obtain the following robust selection procedure. Note that
a similar idea was used by Martin (1980) for autoregressive models.

For a given constant $\alpha$ and a given function $\rho$, choose the model
that minimizes

\[ \mathrm{AICR}(p;\alpha,\rho) = 2 \sum_{i=1}^{n} \rho(r_{i;p}) + \alpha p , \tag{5} \]

where $r_{i;p} = (y_i - x_i^T T_{n;p})/\hat\sigma$, $\hat\sigma$ is some robust
estimate of $\sigma$ and $T_{n;p}$ is the M-estimator defined as the implicit
solution of the system of equations

\[ \sum_{i=1}^{n} \psi(r_{i;p})\, x_i = 0 , \tag{6} \]

with $\psi(r) = d\rho/dr$.




The extension of AIC to AICR is the exact counterpart of that of
maximum likelihood estimation to M-estimation; cf. Huber (1981, Section 3.2).
In particular, if we choose $\rho$ as Huber's function

\[ \rho_c(r) = \begin{cases} r^2/2 & \text{if } |r| \le c \\ c|r| - c^2/2 & \text{otherwise,} \end{cases} \tag{7} \]

then $T_{n;p}$ is Huber's estimator and $\mathrm{AICR}(p;\alpha,\rho_c)$ is the
generalized Akaike statistic (1) computed under the least favorable errors'
distribution with density

\[ g_0(r) = (1-\varepsilon)(2\pi)^{-1/2} \exp(-\rho_c(r)) , \tag{8} \]

where $c$ is a function of the contamination $\varepsilon$; cf. Huber (1981,
Chapter 4). In this case a robust estimate for $\sigma$ can be obtained using
Huber's Proposal 2 (Huber 1981, p. 137) or Hampel's median absolute deviation
(Hampel 1974, p. 388) in the model with all parameters.
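The criterion (5) with Huber's function (7) can be sketched in a few lines of code. The sketch below is only an illustration under stated assumptions, not the procedure's reference implementation: the function names (`huber_rho`, `huber_fit`, `mad_scale`, `aicr`), the tuning constant `c = 1.345`, the use of iteratively reweighted least squares to solve (6) for a fixed scale, and the choice of computing the scale from the least squares residuals of the full model are all hypothetical choices made here for brevity.

```python
import math
import statistics


def huber_rho(r, c=1.345):
    """Huber's rho_c from (7): quadratic in the middle, linear in the tails."""
    a = abs(r)
    return 0.5 * r * r if a <= c else c * a - 0.5 * c * c


def solve(A, b):
    """Solve a small system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def weighted_ls(X, y, w):
    """Weighted least squares via the normal equations X'WX theta = X'Wy."""
    p = len(X[0])
    A = [[sum(wi * xi[a] * xi[b] for wi, xi in zip(w, X)) for b in range(p)]
         for a in range(p)]
    rhs = [sum(wi * xi[a] * yi for wi, xi, yi in zip(w, X, y)) for a in range(p)]
    return solve(A, rhs)


def residuals(X, y, theta):
    """Raw residuals y_i - x_i' theta."""
    return [yi - sum(t * v for t, v in zip(theta, xi)) for xi, yi in zip(X, y)]


def mad_scale(res):
    """Median absolute deviation, rescaled by 0.6745 for the normal model."""
    med = statistics.median(res)
    return statistics.median([abs(e - med) for e in res]) / 0.6745


def huber_fit(X, y, sigma, c=1.345, iters=200):
    """Huber M-estimate for a fixed scale (assumed solver: IRLS)."""
    theta = weighted_ls(X, y, [1.0] * len(y))  # least squares start
    for _ in range(iters):
        res = residuals(X, y, theta)
        # IRLS weight psi_c(r)/r with r = e/sigma
        w = [1.0 if abs(e) <= c * sigma else c * sigma / abs(e) for e in res]
        theta = weighted_ls(X, y, w)
    return theta


def aicr(X, y, sigma, alpha=2.0, c=1.345):
    """AICR(p; alpha, rho_c) = 2 * sum rho_c(r_i) + alpha * p, as in (5)."""
    theta = huber_fit(X, y, sigma, c)
    r = [e / sigma for e in residuals(X, y, theta)]
    return 2.0 * sum(huber_rho(ri, c) for ri in r) + alpha * len(X[0])


# Hypothetical toy data: a straight line with one gross outlier.
x = [float(i) for i in range(20)]
y = [1.0 + 2.0 * xi + 0.1 * math.sin(xi) for xi in x]
y[10] += 30.0
full = [[1.0, xi] for xi in x]
# Scale from the model with all parameters (crude: LS residuals are used here).
sigma = mad_scale(residuals(full, y, weighted_ls(full, y, [1.0] * len(y))))
for name, X in (("intercept only", [[1.0] for _ in x]), ("intercept + slope", full)):
    print(name, aicr(X, y, sigma))
```

The same scale estimate is used for every candidate model, so that the values of AICR are comparable across models. With a very large `c` every weight equals one, the fit is least squares, and `aicr` returns $R_p/\hat\sigma^2 + \alpha p$, that is, (3) without the constant.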

Let us now investigate the relationship between AICR and robust testing
procedures. Denote by $\theta^{(j)}$ the jth component of the vector $\theta$
and let

\[ H_0 : \theta^{(j)} = 0 , \quad j = q+1, \ldots, p \]

be the null hypothesis in the model (2). Denote by $\Lambda$ the likelihood
ratio test statistic and define

\[ \ell_{q,p} = -2 (p-q)^{-1} \log \Lambda . \]

Then it is easy to see that

\[ \ell_{q,p} = \alpha - (p-q)^{-1} \bigl( \mathrm{AIC}(p;\alpha) - \mathrm{AIC}(q;\alpha) \bigr) . \tag{10} \]
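The step leading to (10) can be spelled out from (1). The derivation below is a sketch under two assumptions that are not explicit in the text: $\Lambda$ is taken as the ratio of the maximized likelihood under $H_0$ to that under the full model, so that $\log \Lambda = L_q - L_p$, and the test statistic is normalized as $\ell_{q,p} = -2(p-q)^{-1} \log \Lambda$.

```latex
\begin{align*}
\mathrm{AIC}(p;\alpha) - \mathrm{AIC}(q;\alpha)
  &= (-2L_p + \alpha p) - (-2L_q + \alpha q) \\
  &= 2 \log \Lambda + \alpha (p-q) , \\
\alpha - (p-q)^{-1} \bigl( \mathrm{AIC}(p;\alpha) - \mathrm{AIC}(q;\alpha) \bigr)
  &= -2 (p-q)^{-1} \log \Lambda = \ell_{q,p} .
\end{align*}
```

Under these assumptions, choosing the larger model by AIC (that is, $\mathrm{AIC}(p;\alpha) < \mathrm{AIC}(q;\alpha)$) is the same as rejecting $H_0$ when $\ell_{q,p} > \alpha$.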


If we substitute the likelihood ratio test statistic $\ell_{q,p}$ by a robust
version, namely

\[ \ell^{\rho}_{q,p} = 2 (p-q)^{-1} \bigl( D(R) - D(F) \bigr) , \tag{11} \]

where $D(F)$ is the minimum value of $\sum_{i=1}^{n} \rho(r_{i;p})$ and $D(R)$
is the minimum value of $\sum_{i=1}^{n} \rho(r_{i;p})$ subject to $H_0$, the
dispersion of the residuals under the full and reduced models respectively
(see Schrader and Hettmansperger, 1980; Ronchetti, 1982), we obtain

\[ \ell^{\rho}_{q,p} = \alpha - (p-q)^{-1} \bigl( \mathrm{AICR}(p;\alpha,\rho) - \mathrm{AICR}(q;\alpha,\rho) \bigr) . \tag{12} \]


Equation (12) is the natural counterpart of (10) when using robust estimators
and tests.


3.  CHOICE  OF  THE  PARAMETER  $\alpha$


In this section we propose a choice for the parameter $\alpha$ in
$\mathrm{AICR}(p;\alpha,\rho_c)$. It is based on the following result due to
Stone (1977).

The Akaike statistic AIC(p;2) is asymptotically equivalent to

\[ -2L_p + 2\,\mathrm{trace}(M_2^{-1} M_1) , \tag{13} \]

where $-M_2$ is the $(p \times p)$ matrix of the second derivatives (with
respect to $\theta$) of the log-likelihood function and $M_1$ is the
$(p \times p)$ matrix of the products of the first derivatives. Since
$\mathrm{AICR}(p;\alpha,\rho_c)$ can be viewed as the Akaike statistic computed
under the least favorable errors' distribution $g_0$ (see (8)), we obtain

\[ M_1 = E\psi_c^2 \cdot E\,xx^T , \qquad M_2 = E\psi_c' \cdot E\,xx^T , \]

where

\[ \psi_c(r) = d\rho_c/dr = \begin{cases} r & \text{if } |r| \le c \\ c \cdot \mathrm{sign}(r) & \text{otherwise.} \end{cases} \]

Thus $2\,\mathrm{trace}(M_2^{-1} M_1) = 2 (E\psi_c^2 / E\psi_c')\, p$, and we
propose to choose $\alpha = \alpha_c = 2 E\psi_c^2 / E\psi_c' < 2$.
Note that $\alpha_\infty = 2$ and
$\mathrm{AICR}(p;\alpha_\infty,\rho_\infty) = \mathrm{AIC}(p;2)$, which is the
classical Akaike statistic under normality.
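The proposed constant $\alpha_c$ is easy to evaluate numerically. The sketch below assumes that the expectations $E\psi_c^2$ and $E\psi_c'$ are taken under the standard normal distribution; this is an assumption made here for illustration (it is consistent with $\alpha_c < 2$ and $\alpha_\infty = 2$), and the function names are hypothetical. Under that assumption $E\psi_c' = 2\Phi(c) - 1$ and $E\psi_c^2 = 2\Phi(c) - 1 - 2c\varphi(c) + 2c^2(1 - \Phi(c))$.

```python
import math


def huber_moments(c):
    """Return (E psi_c^2, E psi_c') for Huber's psi_c.

    ASSUMPTION: expectations are taken under the standard normal law.
    """
    Phi = 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))         # normal cdf at c
    phi = math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)  # normal pdf at c
    e_dpsi = 2.0 * Phi - 1.0                                 # P(|r| <= c)
    e_psi2 = e_dpsi - 2.0 * c * phi + 2.0 * c * c * (1.0 - Phi)
    return e_psi2, e_dpsi


def alpha_c(c):
    """alpha_c = 2 * E psi_c^2 / E psi_c'."""
    e_psi2, e_dpsi = huber_moments(c)
    return 2.0 * e_psi2 / e_dpsi


for c in (1.0, 1.345, 2.0, 10.0):
    print(c, round(alpha_c(c), 4))
```

In this setting the penalty per parameter is smaller than 2 for finite `c` and approaches 2 as `c` grows.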




Remark 


Hampel obtains another choice for $\alpha$ "by adding the average decrease
of $\sum_{i=1}^{n} \rho(r_i)$ and the average increase of the total mean square
error of fit due to a superfluous parameter under normality" (Hampel, 1983).
His choice for $\alpha$ is

\[ \alpha = E\psi_c^2 / E\psi_c' + E\psi_c^2 / (E\psi_c')^2 , \]

which differs little from 2 for the usual values of $c$ (e.g. $c$ between 1.3
and 1.6).
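The claim that this choice stays close to 2 can be checked numerically. The following sketch again assumes that the expectations are taken under the standard normal distribution, in line with the phrase "under normality" in the quotation; the function names are hypothetical.

```python
import math


def huber_moments(c):
    """(E psi_c^2, E psi_c') for Huber's psi_c, ASSUMING standard normal errors."""
    Phi = 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))
    phi = math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)
    e_dpsi = 2.0 * Phi - 1.0
    e_psi2 = e_dpsi - 2.0 * c * phi + 2.0 * c * c * (1.0 - Phi)
    return e_psi2, e_dpsi


def hampel_alpha(c):
    """alpha = E psi_c^2 / E psi_c' + E psi_c^2 / (E psi_c')^2."""
    e_psi2, e_dpsi = huber_moments(c)
    return e_psi2 / e_dpsi + e_psi2 / e_dpsi ** 2


for c in (1.3, 1.345, 1.6):
    print(c, round(hampel_alpha(c), 3))
```

For the tuning constants in the quoted range the computed value stays within 0.1 of 2.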